Speech and Hand Transcribed Retrieval

نویسندگان

  • Mark Sanderson
  • Xiao Mang Shou
چکیده

This paper describes the issues and preliminary work involved in the creation of an information retrieval system that will manage the retrieval from collections composed of both speech recognised and ordinary text documents. In previous work, it has been shown that because of recognition errors, ordinary documents are generally retrieved in preference to recognised ones. Means of correcting or eliminating the observed bias is the subject of this paper. Initial ideas and some preliminary results are presented. General Terms Measurement, Experimentation.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Spoken document retrieval by translating recognition candidates into correct transcriptions

This paper proposes an ad hoc retrieval method for spoken documents that uses a statistical translation technique. After transcribing the spoken documents by using a Large-Vocabulary Continuous Speech Recognition (LVCSR) decoder, a text-based ad hoc retrieval method can be directly applied to the transcribed documents. However, recognition errors will signi cantly degrade the retrieval performa...

متن کامل

Language Modeling Approach for Retrieving Passages in Lecture Audio Data

Spoken Document Retrieval (SDR) is a promising technology for enhancing the utility of spoken materials. After the spoken documents have been transcribed by using a Large Vocabulary Continuous Speech Recognition (LVCSR) decoder, a text-based ad hoc retrieval method can be applied directly to the transcribed documents. However, recognition errors will significantly degrade the retrieval performa...

متن کامل

Combining Multiple Models for Speech Information Retrieval

In this article we present a method for combining different information retrieval models in order to increase the retrieval performance in a Speech Information Retrieval task. The formulas for combining the models are tuned on training data. Then the system is evaluated on test data. The task is particularly difficult because the text collection is automatically transcribed spontaneous speech, ...

متن کامل

Operational Assessment of Keyword Search on Oral History

This project assesses the resources necessary to make oral history searchable by means of automatic speech recognition (ASR). There are many inherent challenges in applying ASR to conversational speech: smaller training set sizes and varying demographics, among others. We assess the impact of dataset size, word error rate and term-weighted value on human search capability through an information...

متن کامل

Inference Network-based Indonesian Spoken Query Information Retrieval

Query term misrecognition caused by the speech recognizer is one of the important issues in the spoken query information retrieval. The misrecognized term in the transcribed query leads to the retrieval of irrelevant documents. To raise the correct ranking of the retrieved documents, we use a speech recognition confidence score based on word posterior probability to weight the term in the infer...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2001